SOCIAL NETWORKS ANALYSIS

Professor: ALVARO ROMERO MIRALLES

Program: MBD April intake

Group Final Assignment

Team A

The following dataset is used: the OpenFlights dataset (directed graph). “The data is downloaded from Openflights.org. a directed network containing flights between airports of the world, in which directed edge represents a flight from one airport to another from the year 2010. Here it has 2939 nodes and 30501 edges. As such, it gives much more of a complete picture and avoids the sample selection. The weights in this network refer to the number of routes between two airports.” Source: http://opsahl.co.uk/tnet/datasets/openflights.dl

Metadata downloaded from: https://openflights.org/data.html https://raw.githubusercontent.com/jpatokal/openflights/master/data/airports.dat

Some of the airport ids are missing from our metadata. We assume that these may be privately owned airports or military bases, and are therefore not included in the airports dataset.

We had to download the data and edit the headers for R to be able to read it, since read_graph("http://opsahl.co.uk/tnet/datasets/openflights.dl", format = c("dl")) was giving us an error. As for the metadata, we converted the file to CSV for readability.

Both edited files are in the downloaded folder.

Loading data

# Set the folder (path) that contains this R file as the working directory
#dir <- dirname(rstudioapi::getActiveDocumentContext()$path)
#setwd(dir)

library(igraph)
## Warning: package 'igraph' was built under R version 3.5.3
## 
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## The following object is masked from 'package:base':
## 
##     union
library(data.table)
## Warning: package 'data.table' was built under R version 3.5.3
library(stringr)
## Warning: package 'stringr' was built under R version 3.5.3
library(igraphdata)
## Warning: package 'igraphdata' was built under R version 3.5.3
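# Edge list: one row per directed airport pair, weight = number of routes
openflights <- fread("openflights.dl.tsv", header = FALSE)
colnames(openflights) <- c("from", "to", "weight")
# Airport metadata (name, city, country, coordinates, ...), keyed by id
metadata <- read.csv("airports.dat.csv", header = TRUE)
# Build the directed, weighted graph from the edge list
flights <- graph.data.frame(as.data.frame(openflights)) 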
head(openflights)
##    from   to weight
## 1:    1    5      1
## 2:    2    4      1
## 3:    2    5      1
## 4:    2    6      2
## 5:    2 5430      1
## 6:    3    2      1
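
A quick way to see which airport ids appear in the edge list but not in the metadata is a set difference. The sketch below uses a small made-up edge list and metadata table (the values are illustrative assumptions, not rows from the dataset); with the real objects, the same idea would apply to the from/to columns of openflights and the id column of metadata.

```r
# Toy stand-ins for the edge list and the airport metadata (illustrative values)
toy_edges <- data.frame(from = c(1, 2, 2, 3), to = c(5, 4, 5430, 2))
toy_meta  <- data.frame(id = c(1, 2, 3, 4, 5),
                        city = c("A", "B", "C", "D", "E"),
                        stringsAsFactors = FALSE)

# Every id that occurs as an origin or a destination
edge_ids <- union(toy_edges$from, toy_edges$to)

# Ids used in the network that have no row in the metadata
missing_ids <- setdiff(edge_ids, toy_meta$id)
print(missing_ids)
```

Here only id 5430 is reported, because it is the only endpoint without a metadata row.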

Summary Analysis

summary(flights)
## IGRAPH ca91f15 DNW- 2939 30501 -- 
## + attr: name (v/c), weight (e/n)

The summary describes the graph as directed, named and weighted (the DNW flags). The graph has 2,939 nodes and 30,501 edges. The name attribute is a vertex-level attribute of character type (v/c), while the weight attribute is an edge-level numeric attribute (e/n).

Degree distribution

Since our graph is directed, we first look at the total degree distribution, treating the graph as undirected. We then look at the in-degree and out-degree distributions to better understand the graph, together with the mean and standard deviation of each.

Directed Flights Graph total-degree distribution

deg <- degree(as.undirected(flights), mode="total")
hist(deg, main="Histogram of Total node degree",xlim=c(0,50), ylim=c(0,1500),breaks = 100)

deg.dist <- degree_distribution(as.undirected(flights), cumulative=T, mode="total")
plot( x=0:max(deg), y=1-deg.dist, pch=19, cex=1.2, col="orange", 
      xlab="Degree", ylab="Cumulative Frequency for total degree", xlim = c(0,100))

sprintf("mean: %f",mean(deg))
## [1] "mean: 10.668255"
sprintf("sd: %f",sd(deg))
## [1] "sd: 21.929753"

Directed Flights Graph in-degree distribution

deg <- degree(flights, mode="in")
hist(deg, main="Histogram of node in-degree",xlim=c(0,50), ylim=c(0,1500),breaks = 100)

deg.dist <- degree_distribution(flights, cumulative=T, mode="in")
plot( x=0:max(deg), y=1-deg.dist, pch=19, cex=1.2, col="orange", 
      xlab="Degree", ylab="Cumulative Frequency for in degree", xlim = c(0,100))

sprintf("mean: %f",mean(deg))
## [1] "mean: 10.378020"
sprintf("sd: %f",sd(deg))
## [1] "sd: 21.580055"

Directed Flights Graph out-degree distribution

deg <- degree(flights, mode="out")
hist(deg, main="Histogram of node out-degree",xlim=c(0,50), ylim=c(0,1500),breaks = 100)

deg.dist <- degree_distribution(flights, cumulative=T, mode="out")
plot( x=0:max(deg), y=1-deg.dist, pch=19, cex=1.2, col="orange", 
      xlab="Degree", ylab="Cumulative Frequency for out degree", xlim = c(0,100))

sprintf("mean: %f",mean(deg))
## [1] "mean: 10.378020"
sprintf("sd: %f",sd(deg))
## [1] "sd: 21.649405"
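
The in-degree and out-degree means reported above are identical (10.378020), and this is expected: every directed edge adds one to exactly one in-degree and one to exactly one out-degree, so both means equal the number of edges divided by the number of nodes (30501 / 2939 ≈ 10.378). A minimal sketch on a hypothetical toy graph (the edges below are made up for illustration):

```r
library(igraph)

# Toy directed graph (hypothetical edges, not from the dataset)
toy <- graph_from_data_frame(data.frame(
  from = c("a", "a", "b", "c"),
  to   = c("b", "c", "c", "a")
))

in_deg  <- degree(toy, mode = "in")
out_deg <- degree(toy, mode = "out")

# Each edge adds exactly one to some in-degree and one to some out-degree,
# so both means equal (number of edges) / (number of nodes)
mean(in_deg)
mean(out_deg)
ecount(toy) / vcount(toy)
```

Only the spread (the standard deviation) can differ between the two distributions, which is what the outputs above show.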

As we can see from the charts, the degree distribution of our network has a long tail. This reflects the fact that many of the airports in our dataset are either really small local airports or privately owned airports. The nodes with higher degrees are international airports that connect major cities to each other and to the rest of the world. The biggest airports tend to have a lot of edges, often with higher weights, which causes the skewness in the degree distribution. Since our graph is directed, the in-degree reflects the inbound connections towards an airport, while the out-degree describes the outbound ones.

Network Diameter

Network diameter, average path length, and the clustering coefficient, without considering weights.

sprintf("Flights Network diameter: %d",diameter(flights, directed=T, weights = NA))
## [1] "Flights Network diameter: 17"
print("this is the shortest path, using airport ids, to go from id 5522 to 7430")
## [1] "this is the shortest path, using airport ids, to go from id 5522 to 7430"
E(flights, path=get_diameter(flights))
## + 17/30501 edges from ca91f15 (vertex names):
##  [1] 5522->5482 5482->5543 5543->5490 5490->91   91  ->143  143 ->133 
##  [7] 133 ->144  144 ->111  111 ->1382 1382->912  912 ->929  929 ->931 
## [13] 931 ->5619 5619->5618 5618->5616 5616->5621 5621->7430
sprintf("Flights Network Average Path Length: %f",mean_distance(flights, directed=T))
## [1] "Flights Network Average Path Length: 4.145036"

The diameter is the longest shortest path in our network. In other words, it is the minimum number of flights you have to take to get from one end of the network to the other, between the two airports that are furthest apart, assuming you are willing to travel on connecting flights only.
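
To make the definition concrete, here is a small sketch on a hypothetical chain of four airports (names and weights are made up). It also illustrates one point that is easy to miss: as far as we understand igraph's defaults, diameter() and get_diameter() use the weight edge attribute as edge length unless weights = NA is passed, so the unweighted and weighted results can differ.

```r
library(igraph)

# Toy directed route chain a -> b -> c -> d (hypothetical; weights are made up)
toy <- graph_from_data_frame(data.frame(
  from   = c("a", "b", "c"),
  to     = c("b", "c", "d"),
  weight = c(1, 1, 5)
))

# weights = NA ignores the weight attribute and counts flights (hops)
hops <- diameter(toy, directed = TRUE, weights = NA)

# With the default weights = NULL, igraph uses the "weight" edge attribute
# as edge lengths, so the result is a sum of weights rather than a hop count
weighted <- diameter(toy, directed = TRUE)

hops      # 3 flights, passing through 2 intermediate airports (b and c)
weighted  # 1 + 1 + 5 = 7

# Airports along the longest shortest path, in order
as_ids(get_diameter(toy, directed = TRUE, weights = NA))
```

A diameter of 3 flights means 2 intermediate airports, in the same way that a diameter of 17 in our network means 16 airports in between.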

This is a plot of the diameter, i.e. the longest shortest path:

diam <- get_diameter(flights, directed=T)
source_diameter <- as.character(metadata[metadata$id == 5522, "city"])
target_diameter <-as.character(metadata[metadata$id == 5621, "city"])

sprintf("Going from %s", source_diameter)
## [1] "Going from Peawanuck"
sprintf("to %s",target_diameter)
## [1] "to Tsiroanomandidy"
# The city lookup above uses id 5621, the second-to-last airport on the path,
# because the final id (7430) has no matching name in our airports metadata.
sprintf("we have to pass through %s airports!",diameter(flights)-1)
## [1] "we have to pass through 16 airports!"
vcol <- rep("gray40", vcount(flights))
vcol[diam] <- "gold"
ecol <- rep("gray80", ecount(flights))
ecol[E(flights, path=diam)] <- "orange"
E(flights, path=diam) # finds edges along a path, here 'diam'
## + 17/30501 edges from ca91f15 (vertex names):
##  [1] 5522->5482 5482->5543 5543->5490 5490->91   91  ->143  143 ->133 
##  [7] 133 ->144  144 ->111  111 ->1382 1382->912  912 ->929  929 ->931 
## [13] 931 ->5619 5619->5618 5618->5616 5616->5621 5621->7430
plot(flights, vertex.color=vcol, edge.color=ecol, edge.arrow.mode=0, vertex.label= NA)

Calculating the local and global clustering coefficients

sprintf("Flights Network Clustering Coefficient: %f",transitivity(as.undirected(flights),type="global", weights = NA))
## [1] "Flights Network Clustering Coefficient: 0.254718"
sprintf("Flights Network Graph average local clustering coefficient: %f",mean(transitivity(as.directed(flights),type="local", weights = NA), na.rm = T))
## [1] "Flights Network Graph average local clustering coefficient: 0.123600"
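
The two figures above differ because they average different things: the global coefficient is a single ratio over the whole graph (three times the number of triangles divided by the number of connected triples), while the local version is computed per vertex and then averaged, with vertices of degree below 2 dropped as NaN. A minimal sketch on a hypothetical four-vertex graph shows the difference:

```r
library(igraph)

# Toy undirected graph: triangle a-b-c plus a pendant vertex d attached to c
toy <- graph_from_data_frame(
  data.frame(from = c("a", "a", "b", "c"), to = c("b", "c", "c", "d")),
  directed = FALSE
)

# Global: 3 * triangles / connected triples = 3 * 1 / 5
global_cc <- transitivity(toy, type = "global")

# Local: per vertex; a and b get 1, c gets 1/3, d has degree 1 and gets NaN
local_cc <- transitivity(toy, type = "local")

global_cc                      # 0.6
mean(local_cc, na.rm = TRUE)   # (1 + 1 + 1/3) / 3
```

In the toy graph the high-degree vertex c pulls the global value down more than the local average, since the global ratio weights each triple equally rather than each vertex.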

Node importance: Centrality measures

deg <- degree(flights, mode="total")
btw <-betweenness(flights)
cls <-closeness(flights)
## Warning in closeness(flights): At centrality.c:2617 :closeness centrality
## is not well-defined for disconnected graphs
centrality_table <- cbind(deg, btw, cls)
centrality_table <- as.data.frame(centrality_table)
centrality_table <- setDT(centrality_table, keep.rownames = TRUE)[]
centrality_table$rn <- as.numeric(centrality_table$rn)
centrality_table <- merge(centrality_table,metadata, by.x = "rn",by.y = "id", all = T)
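
The warning above matters for interpretation: when some airports cannot reach others, distances to them are undefined, and how closeness() handles that has varied between igraph versions. One possible workaround, shown here only as a sketch on a hypothetical toy graph and not as what this report did, is to restrict the calculation to the largest strongly connected component, where every distance is finite.

```r
library(igraph)

# Toy directed graph: a, b, c can all reach each other; d can fly to a,
# but nothing flies to d (hypothetical edges)
toy <- graph_from_data_frame(data.frame(
  from = c("a", "b", "b", "c", "d"),
  to   = c("b", "a", "c", "b", "a")
))

# Strongly connected components: sets of vertices that are mutually reachable
comp <- components(toy, mode = "strong")
biggest <- which.max(comp$csize)

# Keep only the largest strongly connected component
core <- induced_subgraph(toy, which(comp$membership == biggest))

# Inside one strongly connected component every distance is finite,
# so closeness = 1 / (sum of distances to all other vertices)
cls_core <- closeness(core, mode = "out")
cls_core
```

In this toy case b scores highest (1/2), since it is one flight away from both a and c.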

Ranking the top 20 nodes by degree

centrality_table[order(centrality_table$deg, decreasing = T),][1:20]
##       rn deg       btw          cls
##  1:  340 473 609655.81 6.186587e-06
##  2: 1382 426 390233.09 6.166559e-06
##  3:  580 395 358465.59 6.172420e-06
##  4: 3364 340 253341.16 6.155627e-06
##  5:  502 339 158156.60 6.147604e-06
##  6: 1701 338 262874.91 6.169640e-06
##  7: 3682 335 317910.95 6.163556e-06
##  8: 2188 331 454134.18 6.176004e-06
##  9: 4029 317 315105.61 6.153657e-06
## 10:  507 315 222811.26 6.155551e-06
## 11: 1229 311 234044.76 6.163632e-06
## 12: 1555 307 182097.58 6.165152e-06
## 13:  346 297 116526.20 6.164886e-06
## 14: 1218 284 119789.97 6.161012e-06
## 15: 3797 282 276848.04 6.172116e-06
## 16:  548 277  86059.93 6.120363e-06
## 17: 3877 277 228709.89 6.125987e-06
## 18: 3885 264 283263.81 6.150062e-06
## 19:  599 260  73546.93 6.140243e-06
## 20: 3406 252 175861.27 6.145828e-06
##                                                 name      city
##  1:                        Frankfurt am Main Airport Frankfurt
##  2:          Charles de Gaulle International Airport     Paris
##  3:                       Amsterdam Airport Schiphol Amsterdam
##  4:            Beijing Capital International Airport   Beijing
##  5:                           London Gatwick Airport    London
##  6:                    Atatürk International Airport  Istanbul
##  7: Hartsfield Jackson Atlanta International Airport   Atlanta
##  8:                      Dubai International Airport     Dubai
##  9:                 Domodedovo International Airport    Moscow
## 10:                          London Heathrow Airport    London
## 11:             Adolfo Suárez Madrid–Barajas Airport    Madrid
## 12:              Leonardo da Vinci–Fiumicino Airport      Rome
## 13:                                   Munich Airport    Munich
## 14:                  Barcelona International Airport Barcelona
## 15:             John F Kennedy International Airport  New York
## 16:                          London Stansted Airport    London
## 17:                   McCarran International Airport Las Vegas
## 18:                             Suvarnabhumi Airport   Bangkok
## 19:                                   Dublin Airport    Dublin
## 20:            Shanghai Pudong International Airport  Shanghai
##                  country IATA ICAO Latitude   Longitude Altitude Timezone
##  1:              Germany  FRA EDDF 50.03333    8.570556      364        1
##  2:               France  CDG LFPG 49.01280    2.550000      392        1
##  3:          Netherlands  AMS EHAM 52.30860    4.763890      -11        1
##  4:                China  PEK ZBAA 40.08010  116.584999      116        8
##  5:       United Kingdom  LGW EGKK 51.14810   -0.190278      202        0
##  6:               Turkey  ISL LTBA 40.97690   28.814600      163        3
##  7:        United States  ATL KATL 33.63670  -84.428101     1026       -5
##  8: United Arab Emirates  DXB OMDB 25.25280   55.364399       62        4
##  9:               Russia  DME UUDD 55.40880   37.906300      588        3
## 10:       United Kingdom  LHR EGLL 51.47060   -0.461941       83        0
## 11:                Spain  MAD LEMD 40.47193   -3.562640     1998        1
## 12:                Italy  FCO LIRF 41.80028   12.238889       13        1
## 13:              Germany  MUC EDDM 48.35380   11.786100     1487        1
## 14:                Spain  BCN LEBL 41.29710    2.078460       12        1
## 15:        United States  JFK KJFK 40.63980  -73.778900       13       -5
## 16:       United Kingdom  STN EGSS 51.88500    0.235000      348        0
## 17:        United States  LAS KLAS 36.08010 -115.152000     2181       -8
## 18:             Thailand  BKK VTBS 13.68110  100.747002        5        7
## 19:              Ireland  DUB EIDW 53.42130   -6.270070      242        0
## 20:                China  PVG ZSPD 31.14340  121.805000       13        8
##     DST                  TZ    Type      Source
##  1:   E       Europe/Berlin airport OurAirports
##  2:   E        Europe/Paris airport OurAirports
##  3:   E    Europe/Amsterdam airport OurAirports
##  4:   U       Asia/Shanghai airport OurAirports
##  5:   E       Europe/London airport OurAirports
##  6:   E     Europe/Istanbul airport OurAirports
##  7:   A    America/New_York airport OurAirports
##  8:   U          Asia/Dubai airport OurAirports
##  9:   N       Europe/Moscow airport OurAirports
## 10:   E       Europe/London airport OurAirports
## 11:   E       Europe/Madrid airport OurAirports
## 12:   E         Europe/Rome airport OurAirports
## 13:   E       Europe/Berlin airport OurAirports
## 14:   E       Europe/Madrid airport OurAirports
## 15:   A    America/New_York airport OurAirports
## 16:   E       Europe/London airport OurAirports
## 17:   A America/Los_Angeles airport OurAirports
## 18:   U        Asia/Bangkok airport OurAirports
## 19:   E       Europe/Dublin airport OurAirports
## 20:   U       Asia/Shanghai airport OurAirports

[Figure: Gephi graphs]

Ranking the top 30 nodes by betweenness centrality

centrality_table[order(centrality_table$btw, decreasing = T),][1:30]
##       rn deg      btw          cls
##  1:  340 473 609655.8 6.186587e-06
##  2: 3774  64 465741.4 6.090431e-06
##  3: 2188 331 454134.2 6.176004e-06
##  4: 1382 426 390233.1 6.166559e-06
##  5: 2279 171 368500.3 6.157787e-06
##  6:  580 395 358465.6 6.172420e-06
##  7:  193 234 331555.7 6.169640e-06
##  8:  146 137 324197.2 6.143978e-06
##  9: 3682 335 317910.9 6.163556e-06
## 10: 4029 317 315105.6 6.153657e-06
## 11: 3577 127 294856.7 6.140771e-06
## 12: 3484 234 290454.8 6.162758e-06
## 13: 3885 264 283263.8 6.150062e-06
## 14: 3304 213 280042.8 6.144091e-06
## 15: 3797 282 276848.0 6.172116e-06
## 16: 1701 338 262874.9 6.169640e-06
## 17: 3364 340 253341.2 6.155627e-06
## 18: 3316 225 241419.5 6.140696e-06
## 19: 2241 183 234138.4 6.170440e-06
## 20: 1229 311 234044.8 6.163632e-06
## 21: 3320 101 232511.9 6.108661e-06
## 22: 2564 164 231007.1 6.148360e-06
## 23: 3877 277 228709.9 6.125987e-06
## 24:    5  63 228015.4 6.079323e-06
## 25:  507 315 222811.3 6.155551e-06
## 26: 3494 215 218900.1 6.167357e-06
## 27: 3361 154 217052.9 6.120026e-06
## 28: 2709 134 199058.2 6.121262e-06
## 29: 2276 153 188945.1 6.138434e-06
## 30: 3930 236 186769.6 6.149494e-06
##       rn deg      btw          cls
##                                                                   name
##  1:                                          Frankfurt am Main Airport
##  2:                        Ted Stevens Anchorage International Airport
##  3:                                        Dubai International Airport
##  4:                            Charles de Gaulle International Airport
##  5:                                       Narita International Airport
##  6:                                         Amsterdam Airport Schiphol
##  7:                            Lester B. Pearson International Airport
##  8:            Montreal / Pierre Elliott Trudeau International Airport
##  9:                   Hartsfield Jackson Atlanta International Airport
## 10:                                   Domodedovo International Airport
## 11:                               Seattle Tacoma International Airport
## 12:                                  Los Angeles International Airport
## 13:                                               Suvarnabhumi Airport
## 14:                                 Kuala Lumpur International Airport
## 15:                               John F Kennedy International Airport
## 16:                                      Atatürk International Airport
## 17:                              Beijing Capital International Airport
## 18:                                           Singapore Changi Airport
## 19:                                         Doha International Airport
## 20:                               Adolfo Suárez Madrid–Barajas Airport
## 21:                                     Brisbane International Airport
## 22:  Guarulhos - Governador André Franco Montoro International Airport
## 23:                                     McCarran International Airport
## 24:                        Port Moresby Jacksons International Airport
## 25:                                            London Heathrow Airport
## 26:                               Newark Liberty International Airport
## 27:                       Sydney Kingsford Smith International Airport
## 28:                                    El Dorado International Airport
## 29:                               Taiwan Taoyuan International Airport
## 30:                                      Incheon International Airport
##                                                                   name
##             city              country IATA ICAO  Latitude   Longitude
##  1:    Frankfurt              Germany  FRA EDDF  50.03333    8.570556
##  2:    Anchorage        United States  ANC PANC  61.17440 -149.996002
##  3:        Dubai United Arab Emirates  DXB OMDB  25.25280   55.364399
##  4:        Paris               France  CDG LFPG  49.01280    2.550000
##  5:        Tokyo                Japan  NRT RJAA  35.76470  140.386002
##  6:    Amsterdam          Netherlands  AMS EHAM  52.30860    4.763890
##  7:      Toronto               Canada  YYZ CYYZ  43.67720  -79.630600
##  8:     Montreal               Canada  YUL CYUL  45.47060  -73.740799
##  9:      Atlanta        United States  ATL KATL  33.63670  -84.428101
## 10:       Moscow               Russia  DME UUDD  55.40880   37.906300
## 11:      Seattle        United States  SEA KSEA  47.44900 -122.308998
## 12:  Los Angeles        United States  LAX KLAX  33.94250 -118.407997
## 13:      Bangkok             Thailand  BKK VTBS  13.68110  100.747002
## 14: Kuala Lumpur             Malaysia  KUL WMKK   2.74558  101.709999
## 15:     New York        United States  JFK KJFK  40.63980  -73.778900
## 16:     Istanbul               Turkey  ISL LTBA  40.97690   28.814600
## 17:      Beijing                China  PEK ZBAA  40.08010  116.584999
## 18:    Singapore            Singapore  SIN WSSS   1.35019  103.994003
## 19:         Doha                Qatar  DIA OTBD  25.26110   51.565102
## 20:       Madrid                Spain  MAD LEMD  40.47193   -3.562640
## 21:     Brisbane            Australia  BNE YBBN -27.38420  153.117004
## 22:    Sao Paulo               Brazil  GRU SBGR -23.43556  -46.473057
## 23:    Las Vegas        United States  LAS KLAS  36.08010 -115.152000
## 24: Port Moresby     Papua New Guinea  POM AYPY  -9.44338  147.220001
## 25:       London       United Kingdom  LHR EGLL  51.47060   -0.461941
## 26:       Newark        United States  EWR KEWR  40.69250  -74.168701
## 27:       Sydney            Australia  SYD YSSY -33.94610  151.177002
## 28:       Bogota             Colombia  BOG SKBO   4.70159  -74.146900
## 29:       Taipei               Taiwan  TPE RCTP  25.07770  121.233002
## 30:        Seoul          South Korea  ICN RKSI  37.46910  126.450996
##             city              country IATA ICAO  Latitude   Longitude
##     Altitude Timezone DST                   TZ    Type      Source
##  1:      364        1   E        Europe/Berlin airport OurAirports
##  2:      152       -9   A    America/Anchorage airport OurAirports
##  3:       62        4   U           Asia/Dubai airport OurAirports
##  4:      392        1   E         Europe/Paris airport OurAirports
##  5:      141        9   U           Asia/Tokyo airport OurAirports
##  6:      -11        1   E     Europe/Amsterdam airport OurAirports
##  7:      569       -5   A      America/Toronto airport OurAirports
##  8:      118       -5   A      America/Toronto airport OurAirports
##  9:     1026       -5   A     America/New_York airport OurAirports
## 10:      588        3   N        Europe/Moscow airport OurAirports
## 11:      433       -8   A  America/Los_Angeles airport OurAirports
## 12:      125       -8   A  America/Los_Angeles airport OurAirports
## 13:        5        7   U         Asia/Bangkok airport OurAirports
## 14:       69        8   N    Asia/Kuala_Lumpur airport OurAirports
## 15:       13       -5   A     America/New_York airport OurAirports
## 16:      163        3   E      Europe/Istanbul airport OurAirports
## 17:      116        8   U        Asia/Shanghai airport OurAirports
## 18:       22        8   N       Asia/Singapore airport OurAirports
## 19:       35        3   U           Asia/Qatar airport OurAirports
## 20:     1998        1   E        Europe/Madrid airport OurAirports
## 21:       13       10   N   Australia/Brisbane airport OurAirports
## 22:     2459       -3   S    America/Sao_Paulo airport OurAirports
## 23:     2181       -8   A  America/Los_Angeles airport OurAirports
## 24:      146       10   U Pacific/Port_Moresby airport OurAirports
## 25:       83        0   E        Europe/London airport OurAirports
## 26:       18       -5   A     America/New_York airport OurAirports
## 27:       21       10   O     Australia/Sydney airport OurAirports
## 28:     8361       -5   U       America/Bogota airport OurAirports
## 29:      106        8   U          Asia/Taipei airport OurAirports
## 30:       23        9   U           Asia/Seoul airport OurAirports
##     Altitude Timezone DST                   TZ    Type      Source

[Figure: Gephi graphs]

The interesting point in this chart is Anchorage, Alaska. Geographically, one might think that its high betweenness comes from acting as a bridge between the Far East and the west coast of the USA, but this is not the case. After further analysis, we found that this node acts as a bridge connecting all Alaskan airports and some Canadian airports to the rest of the USA, and therefore to the rest of the world.
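
The Anchorage pattern, a modest degree combined with very high betweenness, is typical of a node that bridges otherwise separate groups. The following sketch reproduces the effect on a hypothetical toy graph (airport names a to h and x are made up): two tightly connected groups are joined only through x.

```r
library(igraph)

# Two tightly knit groups of four airports each (hypothetical names)
group1 <- c("a", "b", "c", "d")
group2 <- c("e", "f", "g", "h")

# All pairs inside each group, plus a single bridge airport x linking a and e
el <- rbind(t(combn(group1, 2)),
            t(combn(group2, 2)),
            c("a", "x"),
            c("x", "e"))
toy <- graph_from_edgelist(el, directed = FALSE)

deg <- degree(toy)
btw <- betweenness(toy)

# x has the lowest degree but sits on every shortest path between the groups
cbind(deg, btw)[order(btw, decreasing = TRUE), ]
```

Here x has degree 2 but lies on all 4 x 4 = 16 shortest paths between the two groups, so it tops the betweenness ranking ahead of the better connected a and e.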

Ranking the top 30 nodes by closeness centrality

centrality_table[order(centrality_table$cls, decreasing = T),][1:30]
##       rn deg       btw          cls
##  1:  340 473 609655.81 6.186587e-06
##  2: 2188 331 454134.18 6.176004e-06
##  3:  580 395 358465.59 6.172420e-06
##  4: 3797 282 276848.04 6.172116e-06
##  5: 2241 183 234138.40 6.170440e-06
##  6:  193 234 331555.67 6.169640e-06
##  7: 1701 338 262874.91 6.169640e-06
##  8: 3494 215 218900.10 6.167357e-06
##  9: 1382 426 390233.09 6.166559e-06
## 10: 1678 245 113311.03 6.165380e-06
## 11: 1555 307 182097.58 6.165152e-06
## 12:  346 297 116526.20 6.164886e-06
## 13: 1229 311 234044.76 6.163632e-06
## 14: 3714 139 125238.64 6.163632e-06
## 15: 3682 335 317910.95 6.163556e-06
## 16: 3484 234 290454.82 6.162758e-06
## 17: 1218 284 119789.97 6.161012e-06
## 18: 4353   1      0.00 6.158394e-06
## 19: 2279 171 368500.29 6.157787e-06
## 20: 1524 222  84419.06 6.157332e-06
## 21: 2985 235 153089.70 6.156422e-06
## 22: 3364 340 253341.16 6.155627e-06
## 23:  507 315 222811.26 6.155551e-06
## 24: 2179 148  83773.71 6.154793e-06
## 25: 4029 317 315105.61 6.153657e-06
## 26:  345 241  63925.95 6.153240e-06
## 27:  302 226  89208.89 6.152029e-06
## 28:  609 228 163821.18 6.151840e-06
## 29: 3670 212 122915.02 6.151650e-06
## 30:  478 241 110645.07 6.150402e-06
##       rn deg       btw          cls
##                                                 name              city
##  1:                        Frankfurt am Main Airport         Frankfurt
##  2:                      Dubai International Airport             Dubai
##  3:                       Amsterdam Airport Schiphol         Amsterdam
##  4:             John F Kennedy International Airport          New York
##  5:                       Doha International Airport              Doha
##  6:          Lester B. Pearson International Airport           Toronto
##  7:                    Atatürk International Airport          Istanbul
##  8:             Newark Liberty International Airport            Newark
##  9:          Charles de Gaulle International Airport             Paris
## 10:                                   Zürich Airport            Zurich
## 11:              Leonardo da Vinci–Fiumicino Airport              Rome
## 12:                                   Munich Airport            Munich
## 13:             Adolfo Suárez Madrid–Barajas Airport            Madrid
## 14:          Washington Dulles International Airport        Washington
## 15: Hartsfield Jackson Atlanta International Airport           Atlanta
## 16:                Los Angeles International Airport       Los Angeles
## 17:                  Barcelona International Airport         Barcelona
## 18:                          Anapa Vityazevo Airport             Anapa
## 19:                     Narita International Airport             Tokyo
## 20:                   Malpensa International Airport            Milano
## 21:               Sheremetyevo International Airport            Moscow
## 22:            Beijing Capital International Airport           Beijing
## 23:                          London Heathrow Airport            London
## 24:                  Abu Dhabi International Airport         Abu Dhabi
## 25:                 Domodedovo International Airport            Moscow
## 26:                               Düsseldorf Airport       Duesseldorf
## 27:                                 Brussels Airport          Brussels
## 28:                       Copenhagen Kastrup Airport        Copenhagen
## 29:          Dallas Fort Worth International Airport Dallas-Fort Worth
## 30:                               Manchester Airport        Manchester
##                                                 name              city
##                  country IATA ICAO Latitude   Longitude Altitude Timezone
##  1:              Germany  FRA EDDF 50.03333    8.570556      364        1
##  2: United Arab Emirates  DXB OMDB 25.25280   55.364399       62        4
##  3:          Netherlands  AMS EHAM 52.30860    4.763890      -11        1
##  4:        United States  JFK KJFK 40.63980  -73.778900       13       -5
##  5:                Qatar  DIA OTBD 25.26110   51.565102       35        3
##  6:               Canada  YYZ CYYZ 43.67720  -79.630600      569       -5
##  7:               Turkey  ISL LTBA 40.97690   28.814600      163        3
##  8:        United States  EWR KEWR 40.69250  -74.168701       18       -5
##  9:               France  CDG LFPG 49.01280    2.550000      392        1
## 10:          Switzerland  ZRH LSZH 47.46470    8.549170     1416        1
## 11:                Italy  FCO LIRF 41.80028   12.238889       13        1
## 12:              Germany  MUC EDDM 48.35380   11.786100     1487        1
## 13:                Spain  MAD LEMD 40.47193   -3.562640     1998        1
## 14:        United States  IAD KIAD 38.94450  -77.455803      312       -5
## 15:        United States  ATL KATL 33.63670  -84.428101     1026       -5
## 16:        United States  LAX KLAX 33.94250 -118.407997      125       -8
## 17:                Spain  BCN LEBL 41.29710    2.078460       12        1
## 18:               Russia  AAQ URKA 45.00210   37.347301      174        3
## 19:                Japan  NRT RJAA 35.76470  140.386002      141        9
## 20:                Italy  MXP LIMC 45.63060    8.728110      768        1
## 21:               Russia  SVO UUEE 55.97260   37.414600      622        3
## 22:                China  PEK ZBAA 40.08010  116.584999      116        8
## 23:       United Kingdom  LHR EGLL 51.47060   -0.461941       83        0
## 24: United Arab Emirates  AUH OMAA 24.43300   54.651100       88        4
## 25:               Russia  DME UUDD 55.40880   37.906300      588        3
## 26:              Germany  DUS EDDL 51.28950    6.766780      147        1
## 27:              Belgium  BRU EBBR 50.90140    4.484440      184        1
## 28:              Denmark  CPH EKCH 55.61790   12.656000       17        1
## 29:        United States  DFW KDFW 32.89680  -97.038002      607       -6
## 30:       United Kingdom  MAN EGCC 53.35370   -2.274950      257        0
##                  country IATA ICAO Latitude   Longitude Altitude Timezone
##     DST                  TZ    Type      Source
##  1:   E       Europe/Berlin airport OurAirports
##  2:   U          Asia/Dubai airport OurAirports
##  3:   E    Europe/Amsterdam airport OurAirports
##  4:   A    America/New_York airport OurAirports
##  5:   U          Asia/Qatar airport OurAirports
##  6:   A     America/Toronto airport OurAirports
##  7:   E     Europe/Istanbul airport OurAirports
##  8:   A    America/New_York airport OurAirports
##  9:   E        Europe/Paris airport OurAirports
## 10:   E       Europe/Zurich airport OurAirports
## 11:   E         Europe/Rome airport OurAirports
## 12:   E       Europe/Berlin airport OurAirports
## 13:   E       Europe/Madrid airport OurAirports
## 14:   A    America/New_York airport OurAirports
## 15:   A    America/New_York airport OurAirports
## 16:   A America/Los_Angeles airport OurAirports
## 17:   E       Europe/Madrid airport OurAirports
## 18:   N       Europe/Moscow airport OurAirports
## 19:   U          Asia/Tokyo airport OurAirports
## 20:   E         Europe/Rome airport OurAirports
## 21:   N       Europe/Moscow airport OurAirports
## 22:   U       Asia/Shanghai airport OurAirports
## 23:   E       Europe/London airport OurAirports
## 24:   U          Asia/Dubai airport OurAirports
## 25:   N       Europe/Moscow airport OurAirports
## 26:   E       Europe/Berlin airport OurAirports
## 27:   E     Europe/Brussels airport OurAirports
## 28:   E   Europe/Copenhagen airport OurAirports
## 29:   A     America/Chicago airport OurAirports
## 30:   E       Europe/London airport OurAirports
##     DST                  TZ    Type      Source

Here, however, we decided to look further into the cities and merged airports that belong to the same city.
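
The report does not show how the city-level edge list was produced, so the following is only one plausible way to do it, sketched on made-up data: map each airport to its city, drop flights within the same city, and sum the weights of parallel routes. The airport codes, weights and the city_of lookup are illustrative assumptions.

```r
# Toy airport-level edge list (hypothetical weights, not taken from the dataset)
edges <- data.frame(
  from   = c("LHR", "LGW", "FRA", "LHR"),
  to     = c("FRA", "FRA", "LHR", "LGW"),
  weight = c(2, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Hypothetical airport -> city lookup, standing in for the metadata table
city_of <- c(LHR = "London", LGW = "London", FRA = "Frankfurt")

edges$from_city <- unname(city_of[edges$from])
edges$to_city   <- unname(city_of[edges$to])

# Flights between two airports of the same city would become self-loops
edges <- edges[edges$from_city != edges$to_city, ]

# One row per ordered city pair, weights of merged routes added up
city_edges <- aggregate(weight ~ from_city + to_city, data = edges, FUN = sum)
city_edges
```

The resulting data frame has the same from/to/weight layout as the airport edge list, so it could be passed to graph.data.frame() in the same way.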

citiesgraph <- graph.data.frame(fread("citiesgraph.csv"))
cls <-closeness(citiesgraph)
## Warning in closeness(citiesgraph): At centrality.c:2784 :closeness
## centrality is not well-defined for disconnected graphs
sort(cls, decreasing = T)[1:20]
##       London    Frankfurt       Koumac        Paris    Amsterdam 
## 9.905698e-06 9.892273e-06 9.891784e-06 9.888947e-06 9.869622e-06 
##     New York        Dubai         Rome      Toronto  Los Angeles 
## 9.869525e-06 9.867674e-06 9.853576e-06 9.850664e-06 9.849209e-06 
##     Istanbul        Anapa      Bangkok        Tokyo       Moscow 
## 9.846299e-06 9.844264e-06 9.842132e-06 9.841648e-06 9.840970e-06 
##      Beijing       Munich        Seoul      Atlanta       Zurich 
## 9.839033e-06 9.837872e-06 9.837485e-06 9.836904e-06 9.835066e-06

[Figure: Gephi graphs]

After joining airports by city, we see that London overtakes Frankfurt. This makes sense: Frankfurt has only 2 airports and is not even the capital of the country, and 2 other major airports (Munich and Berlin) are not far away in the same country. London, on the other hand, has around 5 airports serving internal flights within the UK and low-cost flights within Europe.

Clustering: Community Detection

[Figure: Gephi graphs]

When we clustered using the modularity coefficient in Gephi, we obtained the following clusters. We then decided to look at the geographical location of the nodes in these clusters to be able to visualize and fully understand the results.

[Figure: Gephi graphs]

Not surprisingly, the nodes within each cluster are close to each other geographically. This makes complete sense, since airports on the same continent tend to be more connected to each other. Although some countries from different continents end up in the same cluster, this reflects the strong connections between the airports in those countries. The most obvious example is the northern countries of South America: the clustering suggests that Venezuela might be well connected to US states. Other isolated groups, such as Canada and Alaska, reinforce the idea of communities within communities. The clusters can be labeled as North America, South America, Europe, Africa, Middle East and India, Far East Asia, Oceania, and Central Asia.

It is easy to distinguish clusters when they are spread across the map. However, let's look at them now without using the geo-layout plugin.

Gephi Graphs

The clusters here appear in a different way. We can deduce that South America, Africa and the Middle East are connected to the rest of the world by a few hubs such as São Paulo, Cairo and Marrakesh, and Dubai, respectively. Those are the nodes (cities, in this case) that connect these geographical regions to other areas. Europe, by contrast, lies in the middle and is more attached to the entire world. This agrees with our earlier analysis: the nodes with the highest degrees, and many of those at the top of the betweenness table, belong to Europe. Geographically speaking, it also makes sense, since Europe is closer to all the main regions than other areas around the globe.
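One way to check this hub reading outside Gephi is to count, for each vertex, how many of its edges cross a community boundary. The following is only a minimal sketch under stated assumptions: it uses igraph's built-in Zachary karate club graph as a stand-in for the flights network, and the names `g`, `cl_toy`, `cross` and `bridge_count` are illustrative, not objects from this report.

```r
library(igraph)

# Stand-in graph (assumption): the small built-in Zachary network, not the flights data
g <- make_graph("Zachary")

set.seed(1)                      # for reproducibility where Louvain is randomized
cl_toy <- cluster_louvain(g)

# TRUE for every edge whose two endpoints fall in different communities
cross <- crossing(cl_toy, g)

# Endpoints of the crossing edges, as a two-column matrix of vertex ids
bridge_ends <- ends(g, E(g)[cross], names = FALSE)

# Number of inter-community edges incident to each vertex
bridge_count <- tabulate(as.vector(bridge_ends), nbins = vcount(g))

# Vertices with the most inter-community edges: candidate "hub" nodes
head(order(bridge_count, decreasing = TRUE), 5)
```

Applied to the flights graph, the same counting would rank the airports that tie their community to other regions, which could then be compared with the hubs visible in the Gephi layout.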

4. Community detection

Calculate the clusters using the Louvain algorithm.

cl <- cluster_louvain(as.undirected(flights))

plot(cl, as.undirected(flights), vertex.label =NA)

modularity(cl)
## [1] 0.6535182
cfg <- cluster_fast_greedy(as.undirected(flights))

plot(cfg, as.undirected(flights), vertex.label = NA)

modularity(cfg)
## [1] 0.6049943
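
In the output above, Louvain reaches a higher modularity (0.6535) than fast greedy (0.6050). Modularity alone does not say how similar the two partitions are, so it can help to compare them directly. A minimal sketch, assuming igraph's built-in Zachary graph as a small stand-in for the flights network (the `_toy` names are illustrative):

```r
library(igraph)

# Stand-in graph (assumption): small built-in example, not the flights data
g <- make_graph("Zachary")

set.seed(123)                    # for reproducibility where Louvain is randomized
cl_toy  <- cluster_louvain(g)
cfg_toy <- cluster_fast_greedy(g)

# Modularity and number of communities for each method
summary_tbl <- data.frame(
  method      = c("louvain", "fast_greedy"),
  modularity  = c(modularity(cl_toy), modularity(cfg_toy)),
  communities = c(length(cl_toy), length(cfg_toy))
)
print(summary_tbl)

# Normalised mutual information between the two partitions:
# values near 1 mean the methods group the vertices almost identically
nmi <- compare(cl_toy, cfg_toy, method = "nmi")
print(nmi)
```

On the flights data the same calls would take `cl` and `cfg` in place of the toy objects.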

Community detection based on propagating labels: assign each node a unique label, visit the nodes in random order, then replace each vertex's label with the label that appears most frequently among its neighbors. These steps are repeated until each vertex has the most common label of its neighbors.

clp <- cluster_label_prop(as.undirected(citiesgraph))
plot(clp, as.undirected(citiesgraph), vertex.label = NA)
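
The label propagation steps described above can be sketched by hand on a toy graph. This is an illustrative simplification, not the implementation behind `cluster_label_prop`: the graph (two 4-cliques joined by one edge), the helper `top_labels` and the variable `lab` are assumptions made for the example, and a vertex keeps its current label when that label is already among the most frequent ones.

```r
library(igraph)

# Toy graph (assumption): two 4-cliques joined by a single edge
g <- make_full_graph(4) %du% make_full_graph(4)
g <- add_edges(g, c(4, 5))

set.seed(42)
lab <- seq_len(vcount(g))        # step 1: every vertex starts with its own label

# Labels that occur most often among the neighbours of vertex v
top_labels <- function(v) {
  counts <- table(lab[as.integer(neighbors(g, v))])
  as.integer(names(counts)[counts == max(counts)])
}

for (sweep in 1:100) {           # safety cap on the number of sweeps
  for (v in sample(vcount(g))) { # step 2: visit vertices in random order
    best <- top_labels(v)
    # step 3: adopt a most frequent neighbour label, breaking ties at random
    if (!(lab[v] %in% best)) lab[v] <- best[sample.int(length(best), 1)]
  }
  # stop once every vertex carries one of its neighbours' most common labels
  stable <- vapply(seq_len(vcount(g)),
                   function(v) lab[v] %in% top_labels(v), logical(1))
  if (all(stable)) break
}

lab                              # final label (community id) of each vertex
```

Because the visiting order and tie-breaking are random, different seeds can give different partitions, which is also why repeated runs of label propagation on the same graph may disagree.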

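# k-core decomposition: coreness of every airport, ignoring edge direction
kc <- coreness(flights, mode = "all")
# Interpolate the four base colours so that every coreness level (0..max) gets its own colour
colrs <- adjustcolor(colorRampPalette(c("gray50", "tomato", "gold", "yellowgreen"))(max(kc) + 1), alpha.f = .6)
plot(flights, vertex.size = 2 + 6 * kc / max(kc), vertex.color = colrs[kc + 1], vertex.label = NA)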

# Community detection with the Louvain algorithm (same call as above)
LouvainCluster <- cluster_louvain(as.undirected(flights))
plot(LouvainCluster, as.undirected(flights), vertex.label = NA)

# Community detection based on greedy optimization of modularity
cfg <- cluster_fast_greedy(as.undirected(flights))
plot(cfg, as.undirected(flights), vertex.label = NA)

How to use this data

Airports have a huge impact on a country's economy. They bring in revenue from airport taxes and from commercial spending in airport shops. We want to use our insights to help cities grow and maximize their potential revenue and economic impact by suggesting how to leverage this network.

First, let us understand the impact of airports on the economy by looking at numbers.

Revenues

We see that in 2018, Spain ranked second in revenue generated from airports. However, we believe that this is not enough: Spain is not living up to its potential.

Number of Passengers

We would like to show our support for building another airport here in Madrid, so that Spain can be part of this list in the coming years. From our experience as residents of Madrid, Barajas Airport is relatively expensive. Flights to major world cities, especially neighbouring ones, cost more from Madrid than from other major European cities, and even from other Spanish airports.

We believe, if Spain successfully leverages its political connections and uses its geographical location to its advantage, it can make Madrid a bigger hub for travelling passengers. Politically, Spain has very strong relationships with South American countries. Geographically, Spain is the closest European country to Africa. As we saw previously in our charts, Africa and South America are not as strongly connected to the rest of the world, in comparison to other continents.

Building a new airport can free up Barajas and transform it into a bigger betweenness hub. A new airport focused on cheap local and regional flights has many advantages. The investment required to build it is large enough to boost the Madrid economy, and it would create many job opportunities for the people of Madrid. Increased local and regional tourism, plus increased airport taxes, would have a positive impact on the entire city. Barajas could then be transformed into an international betweenness airport, catching up with neighbouring cities such as Frankfurt, Paris, and London.